Mining discriminative items in multiple data streams
Similar resources
MINING DISCRIMINATIVE ITEMS IN MULTIPLE DATA STREAMS by Zhenhua
How can we maintain a dynamic profile capturing a user’s reading interest against the common interest? What are the queries that have been asked 1,000 times more frequently to a search engine from users in Asia than in North America? What are the keywords (or tags) that are 1,000 times more frequent in the blog stream on computer games than in the blog stream on Hollywood movies? To answer su...
Mining Noisy Data Streams via a Discriminative Model
The two main challenges typically associated with mining data streams are concept drift and data contamination. To address these challenges, we seek learning techniques and models that are robust to noise and can adapt to changes in a timely fashion. In this paper, we approach the stream-mining problem using a statistical estimation framework, and propose a discriminative model for fast mining of...
Finding Frequent Items in Data Streams
We present a 1-pass algorithm for estimating the most frequent items in a data stream using very limited storage space. Our method relies on a novel data structure called a count sketch, which allows us to estimate the frequencies of all the items in the stream. Our algorithm achieves better space bounds than the previous best known algorithms for this problem for many natural distributions on ...
Finding frequent items in data streams
The frequent items problem is to process a stream of items and find all items occurring more than a given fraction of the time. It is one of the most heavily studied problems in data stream mining, dating back to the 1980s. Many applications rely directly or indirectly on finding the frequent items, and implementations are in use in large scale industrial systems. However, there has not been mu...
Finding Persistent Items in Data Streams
Frequent item mining, which deals with finding items that occur frequently in a given data stream over a period of time, is one of the heavily studied problems in data stream mining. A generalized version of frequent item mining is the persistent item mining, where a persistent item, unlike a frequent item, does not necessarily occur more frequently compared to other items over a short period o...
Journal
Journal title: World Wide Web
Year: 2010
ISSN: 1386-145X, 1573-1413
DOI: 10.1007/s11280-010-0094-0